177 research outputs found
An Approximate Subgame-Perfect Equilibrium Computation Technique for Repeated Games
This paper presents a technique for approximating, up to any precision, the
set of subgame-perfect equilibria (SPE) in discounted repeated games. The
process starts with a single hypercube approximation of the set of SPE. Then
the initial hypercube is gradually partitioned into a set of smaller adjacent
hypercubes, while those hypercubes that cannot contain any point belonging to
the set of SPE are simultaneously withdrawn.
Whether a given hypercube can contain an equilibrium point is verified by an
appropriate mathematical program. Three different formulations of the algorithm
for both approximately computing the set of SPE payoffs and extracting players'
strategies are then proposed: the first two do not assume any external
coordination between players, while the third assumes a
certain level of coordination during game play for convexifying the set of
continuation payoffs after any repeated game history.
Special attention is paid to the question of extracting players' strategies
and their representability in the form of finite automata, an important feature for
artificial agent systems.
Comment: 26 pages, 13 figures, 1 table
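The partition-and-withdraw loop described in this abstract can be pictured with a minimal sketch. It is not the authors' implementation: the function names are hypothetical, and the `may_contain_equilibrium` callback merely stands in for the mathematical program that the paper uses to test a hypercube.

```python
from itertools import product


def split_hypercube(lower, side):
    """Split an axis-aligned hypercube into 2**n children with half the side length."""
    half = side / 2.0
    return [
        (tuple(lo + bit * half for lo, bit in zip(lower, bits)), half)
        for bits in product((0, 1), repeat=len(lower))
    ]


def refine(cubes, may_contain_equilibrium, rounds):
    """Repeatedly split every hypercube and withdraw children that fail the test.

    `may_contain_equilibrium(lower, side)` is a placeholder (an assumption of
    this sketch) for the paper's mathematical program.
    """
    for _ in range(rounds):
        cubes = [
            child
            for lower, side in cubes
            for child in split_hypercube(lower, side)
            if may_contain_equilibrium(*child)
        ]
    return cubes


# Toy test: keep only the cubes that contain the payoff point (0.3, 0.3).
def contains_target(lower, side):
    return all(lo <= 0.3 <= lo + side for lo in lower)


approximation = refine([((0.0, 0.0), 1.0)], contains_target, rounds=3)
```

Each round halves the side length of the surviving hypercubes, so the precision of the approximation is controlled by the number of rounds.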
Hypergame Analysis in E-Commerce: A Preliminary Report
In standard game theory, it is normally assumed that "all the players see the same game", i.e., they are aware of each other's strategies and preferences. This assumption is very strong for real life, where differences in perception affecting the decision-making process seem to be the rule rather than the exception. Attempts have been made to incorporate misperceptions of various types, but most of these attempts are based on quantities (such as probabilities, risk factors, etc.) which are too subjective in general. One approach that seems very attractive is to consider that the players are trying to play "different games" in a hypergame. In this paper, we present a hypergame approach as an analysis tool in the context of multiagent environments. Specifically, we first sketch a brief formal introduction to hypergames. Then we explain how agents can interact through communication or through a mediator when they have different views, and in particular misperceptions, of others' games. After that, we show how agents can take advantage of misperceptions. Finally, we conclude and present some future work.
Keywords: Game Theory, Hypergame, Mediation
Dynamic Y-KD: A Hybrid Approach to Continual Instance Segmentation
Despite the success of deep learning models on instance segmentation, current
methods still suffer from catastrophic forgetting in continual learning
scenarios. In this paper, our contributions for continual instance segmentation
are threefold. First, we propose the Y-knowledge distillation (Y-KD), a
technique that shares a common feature extractor between the teacher and
student networks. As the teacher is also updated with new data in Y-KD, the
increased plasticity results in new modules that are specialized on new
classes. Second, our Y-KD approach is supported by a dynamic architecture
method that trains task-specific modules with a unique instance segmentation
head, thereby significantly reducing forgetting. Third, we complete our
approach by leveraging checkpoint averaging as a simple method to manually
balance the trade-off between performance on the various sets of classes, thus
increasing control over the model's behavior without any additional cost. These
contributions are united in our model that we name the Dynamic Y-KD network.
We perform extensive experiments on several single-step and multi-step
incremental learning scenarios, and we show that our approach outperforms
previous methods both on past and new classes. For instance, compared to recent
work, our method obtains +2.1% mAP on old classes in 15-1, +7.6% mAP on new
classes in 19-1 and reaches 91.5% of the mAP obtained by joint-training on all
classes in 15-5.
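As an illustration of the checkpoint-averaging idea mentioned in this abstract, the sketch below linearly interpolates two parameter sets. Which checkpoints are averaged and how the weight is chosen are assumptions of this sketch, not details given in the abstract; the function name and the dictionary layout are hypothetical.

```python
def average_checkpoints(ckpt_a, ckpt_b, weight_a=0.5):
    """Return weight_a * ckpt_a + (1 - weight_a) * ckpt_b, parameter by parameter.

    Each checkpoint is assumed to be a dict mapping a parameter name to a
    flat list of floats; real frameworks store tensors instead.
    """
    if ckpt_a.keys() != ckpt_b.keys():
        raise ValueError("checkpoints must contain the same parameter names")
    return {
        name: [
            weight_a * a + (1.0 - weight_a) * b
            for a, b in zip(ckpt_a[name], ckpt_b[name])
        ]
        for name in ckpt_a
    }


# Hypothetical weights: one checkpoint favouring old classes, one favouring new ones.
old_ckpt = {"head.weight": [0.0, 2.0]}
new_ckpt = {"head.weight": [4.0, 2.0]}
merged = average_checkpoints(old_ckpt, new_ckpt, weight_a=0.25)
```

Moving `weight_a` towards 1 favours the first checkpoint, and averaging adds no training cost, which is the kind of manual trade-off control the abstract describes.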
Multi-item Auctions for Automatic Negotiation
Available resources can often be limited with regard to the number of demands. In this paper we propose an approach for solving this problem which consists of using the mechanisms of multi-item auctions for allocating the resources to a set of software agents. We consider the resource problem as a market in which there are vendor agents and buyer agents trading on items representing the resources. These agents use multi-item auctions, which are viewed here as a process of automatic negotiation and implemented as a network of intelligent software agents. In this negotiation, agents exhibit different acquisition capabilities which let them act differently depending on the current context or situation of the market. For example, the "richer" an agent is, the more items it can buy, i.e., the more resources it can acquire. We present a model for this approach based on the English auction, then we discuss experimental evidence for such a model.
Keywords: Multi-agent systems, Negotiations, Multi-item auctions
Generative Adversarial Positive-Unlabelled Learning
In this work, we consider the task of classifying binary positive-unlabeled
(PU) data. The existing discriminative learning based PU models attempt to seek
an optimal reweighting strategy for U data, so that a decent decision boundary
can be found. However, given limited P data, the conventional PU models tend to
suffer from overfitting when adapted to very flexible deep neural networks. In
contrast, we are the first to introduce a new paradigm for attacking the
binary PU task from the perspective of generative learning, by leveraging
powerful generative adversarial networks (GANs). Our generative
positive-unlabeled (GenPU) framework incorporates an array of discriminators
and generators that are endowed with different roles in simultaneously
producing positive and negative realistic samples. We provide theoretical
analysis to justify that, at equilibrium, GenPU is capable of recovering both
positive and negative data distributions. Moreover, we show GenPU is
generalizable and closely related to semi-supervised classification. Given
rather limited P data, experiments on both synthetic and real-world datasets
demonstrate the effectiveness of our proposed framework. With infinite
realistic and diverse sample streams generated from GenPU, a very flexible
classifier can then be trained using deep neural networks.
Comment: 8 pages
Multi-agent coordination based on tokens : reduction of the bullwhip effect in a forest supply chain
In this paper, we focus on the supply chain as a multi-agent system and we propose a new coordination technique to reduce the fluctuations of orders placed by each company to its suppliers in such a supply chain. This problem of amplification of the demand variability is called the bullwhip effect. To reduce such a bullwhip effect, we propose a technique based on tokens to achieve a decentralized coordination. Precisely, classical orders manage the demand itself whereas tokens manage effects on company inventory due to variations of this demand. Finally, the proposed approach is validated by the Wood Supply Game, which is a supply chain model used to make players aware of the bullwhip effect. We experimentally verify that our coordination technique leads to less variable orders (i.e. the standard deviation of orders is reduced) while inventory levels are not excessively high but sufficient to avoid backorders.
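The abstract measures order variability by the standard deviation of orders. A minimal sketch of that measurement follows; the ratio to demand variability is a common way of quantifying the bullwhip effect but is an assumption here, and the order and demand series are made-up illustrative numbers, not results from the paper.

```python
from statistics import pstdev


def bullwhip_ratio(orders, demand):
    """Ratio of the standard deviation of placed orders to that of incoming demand.

    A value above 1 means the company amplifies demand variability.
    """
    demand_std = pstdev(demand)
    if demand_std == 0:
        raise ValueError("demand is constant, so the ratio is undefined")
    return pstdev(orders) / demand_std


# Hypothetical weekly series for one company in the chain.
demand = [8, 12, 8, 12]
orders = [4, 16, 4, 16]
ratio = bullwhip_ratio(orders, demand)
```

A coordination technique that reduces the bullwhip effect would push this ratio closer to 1 for each company in the chain.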